Abstract —Massive multiple-input multiple-output (MIMO) is a
promising approach for cellular communication due to its energy 
efficiency and high achievable data rate. These advantages, 
however, can be realized only when channel state information 
(CSI) is available at the transmitter. Since there are many 
antennas, CSI is too large to feed back without compression. 
To compress CSI, prior work has applied compressive sensing 
(CS) techniques and the fact that CSI can be sparsified. The 
adopted sparsifying bases fail, however, to reflect the spatial
correlation and channel conditions or to be feasible in practice. In
this paper, we propose a new sparsifying basis that reflects the
long-term characteristics of the channel, and needs no change 
as long as the spatial correlation model does not change. We 
propose a new reconstruction algorithm for CS, and also suggest 
dimensionality reduction as a compression method. To feed back 
compressed CSI in practice, we propose a new codebook for 
the compressed channel quantization assuming no other-cell 
interference. Numerical results confirm that the proposed channel 
feedback mechanisms show better performance in point-to-point 
(single-user) and point-to-multi-point (multi-user) scenarios. 

Index Terms —MIMO system, multi-user system, channel feedback,
compressed feedback.

I. Introduction

The concept of multiple-input multiple-output (MIMO)
wireless communication employing a large number of antennas,
a.k.a. massive MIMO, has been researched for several years.
It was found that a base station (BS) with more antennas
can recover information at a lower signal-to-noise ratio (SNR)
when the number of antennas is sufficiently large [?]. With
this motivation, the idea of using a very large number of
antennas at the BS in a cellular system was proposed in [?].
Massive MIMO systems are known to provide a large network
capacity gain by supporting many users [?], and higher energy
efficiency [?]. Practical issues, transmit precoding and receive
post processing, and channel estimation issues for massive
MIMO systems were discussed in [?], [?].

A transmitter with multiple antennas has to exploit channel 
state information (CSI) to provide beamforming gains in 
single-user (SU) MIMO systems, and multiplexing gains in 
multi-user (MU) MIMO systems [8]. With inaccurate CSI,
however, there is sum-rate saturation even in massive MIMO
systems [10]. It is, therefore, important to design efficient
channel estimation and feedback strategies. In time division
duplexing (TDD) systems, CSI can be implicitly obtained
using reciprocity. In frequency division duplexing (FDD),
which most cellular systems employ nowadays, the receiver
has to feed back information about the channel state or the
precoding vectors. It is known that the feedback overhead must
increase to maintain a certain level of CSI quantization loss
[?]. From this point of view, it is essential to compress and
quantize CSI efficiently due to the large number of antennas.
To solve these issues, a feedback reduction technique that
exploits the spatial correlation of users was proposed in [15],
and noncoherent trellis-coded quantization for FDD massive
MIMO systems was proposed in [16]. In [15], [16], however, it
was assumed that the spatial correlation matrices are perfectly
available at transmitters.

Compressive sensing (CS) based CSI compression was
applied in [17]. It uses the fact that CSI in massive MIMO
systems has high spatial correlation due to the limited physical
distance between antennas. The theory of CS [18]-[20] has
been applied in various areas including signal processing and
communications, where the information is sparse. A sparse
signal (or vector) is a signal that can be represented by
few elements in a certain domain. Via random projections,
CS is able to compress sparse information efficiently. With
the insight that CSI can be represented in sparse form
in a spatial-frequency domain, two sparsifying bases were
adopted in [17]: the two-dimensional discrete cosine transform
(2D-DCT) and the instantaneous Karhunen-Loeve transform
(KLT). Unlike [15], [16], the approach in [17] does not need to
assume that transmitters know the correlation matrices. Without
this assumption, however, the 2D-DCT basis fails to reflect the
spatial correlation of the systems. The instantaneous KLT basis
changes as the channel varies, making it infeasible in practice.
CS techniques simplify encoding, but require solving an
optimization problem for decoding, thus demanding significant
computing resources.

In this paper, we propose two new compression methods for 
channel feedback in massive MIMO systems using the fact that 
highly correlated CSI can be represented in a sparse form. 
For a sparsifying basis, we adopt the KLT, which considers 
the long-term correlation model of the channel. The first 
method compresses via random projection, while the second 
one uses the sparsifying basis directly. The former method 
is useful when the receiver does not know what basis to use 
(Scenario 1), while the latter method is preferred when the
receiver and the transmitter select what sparsifying basis to
use (Scenario 2). To quantize the compressed CSI, we adopt
the widely used Linde, Buzo, and Gray (LBG) algorithm [?]
and random vector quantization (RVQ) [23]. The main
contributions of this paper are as follows:

• Compression method for Scenario 1: Since the calculation
of the KLT basis from the covariance matrix entails high
complexity, the receiver might be unable to obtain the
basis. In this case, we apply CS, which needs no information
about the basis for compression and simply compresses
via random projections. With the 2D-DCT
or KLT basis, the indices of the dominant elements of
the sparsified CSI are expected to be in a certain region.
Using this fact, our contribution in random-projection-based
compression is a new reconstruction algorithm
with less complexity than conventional decoding
algorithms. We show numerically that, compared to the
conventional reconstruction method for CS-based compression,
channel feedback with the proposed decoding
algorithm performs better in terms of recovery accuracy
and achievable rate.

• Compression method for Scenario 2: In Scenario 2, the
transmitter and the receiver can choose what sparsifying
basis they will use. If both the transmitter and the receiver
have enough computing resources to obtain the KLT
basis, the KLT basis is adopted for sparsifying, and if not,
the 2D-DCT basis is adopted. Thus, both the transmitter
and the receiver can know the basis without any additional
coordination. With the sparsified CSI, we propose to
compress it by dimensionality reduction. The proposed
method is simpler to compress and reconstruct when the
positions of the dominant elements in the sparsified CSI
are expected to be concentrated in a certain region. We show
numerically that, compared to CS-based compression,
the proposed dimensionality-reduction-based compression
performs better regarding recovery accuracy and
achievable rate.

• Codebook construction: For linear precoding, the Grassmannian
codebook has been widely used [24], mostly in
single-user MIMO scenarios. The Grassmannian codebook,
however, can only cover unit-norm vectors, failing
to fit the compressed CSI from either compression
method for MU MIMO scenarios. Therefore, we adopt
the LBG algorithm, which exploits the statistical properties
of the compressed CSI, to generate a codebook. We
analyze how the compressed CSI vectors are distributed
and construct a codebook based on our analysis.


To simplify analysis, we assume that the receiver can
estimate perfect CSI without any noise and/or other-cell
interference, and that there is an ideal control channel that can
send the compressed CSI without errors. Also, we assume
that the spatial correlation is obtainable, with no error, at the
transmitter or at the receiver according to each scenario. This
paper is organized as follows. In Sections II and III, we
introduce the system model for massive multi-user MIMO
systems and review sparse signal compression, including
CS and dimensionality reduction. In Section IV, we explain the
sparsifying bases and the details of the compression methods
with given bases. We also introduce a codebook generation
rule. Performance analysis and our conclusion are given in
Sections V and VI.

(a) One-dimensional ULA (b) Two-dimensional UPA

Figure 1: Geometry and correlations of a ULA and a UPA.


II. System Model


In this section, we explain the system model and the
assumptions.¹ Consider a MIMO broadcast signal model with
$N_u$ receivers, each with $N_r$ receive antennas. Each user
receives its own data stream, which is precoded at the transmitter
with $N_t$ antennas. We consider two types of antenna arrays: a
one-dimensional uniform linear array (ULA) model, and a
two-dimensional uniform planar array (UPA) model. Figure 1
illustrates how the arrays are designed. In Figure 1, we note
that the correlation decreases with the distance. Note that our
algorithms are applicable regardless of the channel correlation model.
In this paper, we do not consider Doppler.

For the one-dimensional ULA model, the 3GPP Spatial
Channel Model (SCM) is adopted [?]. To obtain the KLT
basis $\Psi_{\mathrm{KLT}}$, 1000 channel vectors h are generated to calculate
the covariance $C_h$ for each transmitter-receiver link.

The two-dimensional UPA ($N_V \times N_H$) model can be extended
from the ULA model [26]. For the UPA model, we employ the
Kronecker model to express the $N_r \times N_t$ spatially-correlated
MIMO channel matrix between the transmitter and the k-th
receiver:

$$H_k = R_{\mathrm{RX},k}^{1/2} \, H_{w,k} \, R_{\mathrm{TX},k}^{1/2},$$

where $H_{w,k}$ is an $N_r \times N_t$ matrix whose elements follow the
independent and identically distributed (i.i.d.) complex zero-mean,
unit-variance Gaussian distribution, and $R_{\mathrm{RX},k}$
and $R_{\mathrm{TX},k}$ are the spatial correlation matrices at the k-th
receiver and the transmitter. To simplify ULA modeling, the
correlation matrix $R_{\mathrm{TX},k}$ for the k-th receiver with antenna
spacing d is expressed as [?]:

$$[R_{\mathrm{TX},k}]_{p,q} = \frac{1}{2\Delta} \int_{-\Delta+\phi_k}^{\Delta+\phi_k} e^{-j \frac{2\pi}{\lambda} d (p-q) \sin\alpha} \, d\alpha,$$

¹Throughout this paper, we use upper and lower case boldface to denote a
matrix A and a vector a, respectively. The transpose and the Hermitian transpose
of a matrix are denoted as $(\cdot)^T$ and $(\cdot)^*$, respectively. The vec(·) operator stacks
the columns of a matrix into a vector. E[·] denotes the expectation operator.
$\otimes$ denotes the Kronecker product.
[Figure 2 block labels: vectorize channel matrix, h = vec(H), where $N = N_t N_r$; sparse representation, $s = \Psi^* h$; random measurement matrix $\Phi$ of size $M \times N$. Panels: the modified OMP method (Scenario 1); the dimensionality reduction method (Scenario 2).]

Figure 2: A schematic of the proposed MIMO channel feedback methods.


where $\lambda$ is the carrier wavelength, $\Delta$ is the angular spread,
and $\phi_k$ is the angle of arrival (AoA) for the k-th receiver.
The correlation matrix of the UPA model can be expressed by
combining the vertical correlation matrix $R_V \in \mathbb{C}^{N_V \times N_V}$ and
the horizontal correlation matrix $R_{H,k} \in \mathbb{C}^{N_H \times N_H}$ through the
Kronecker product $R_{\mathrm{TX},k} = R_V \otimes R_{H,k}$. The angular spread
and the AoA for vertical and horizontal correlation matrices
are given as


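$$\Delta_V = \frac{1}{2}\left(\arctan\frac{s+r}{u} - \arctan\frac{s-r}{u}\right), \qquad \phi_V = \frac{1}{2}\left(\arctan\frac{s+r}{u} + \arctan\frac{s-r}{u}\right),$$

$$\Delta_H = \arctan\frac{r}{s}, \qquad \phi_{H,k} \in (-\pi, \pi],$$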


where u, r, and s are the elevation of the transmit antenna,
the radius of the scattering ring for the receiver, and the
distance from the transmitter, respectively. For simulations, we
set u = 60 m, r = 30 m, and s = 100 m. The channel matrix
including all receivers is formed by stacking, column-wise,
the channel matrices between the transmitter and each receiver:

$$H = [\, H_1^T \;\; H_2^T \;\; \cdots \;\; H_{N_u}^T \,]^T.$$
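The Kronecker channel model of this section can be sketched numerically as follows. This is a minimal illustration, not the simulation setup of the paper: the exponential correlation function `exp_corr`, the helper `psd_sqrt`, and all parameter values are hypothetical stand-ins for the one-ring correlation matrices.

```python
import numpy as np

def psd_sqrt(r):
    """Hermitian square root of a positive semidefinite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(r)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def exp_corr(n, rho):
    """Hypothetical exponential correlation matrix (stand-in for the one-ring model)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

rng = np.random.default_rng(5)
n_v, n_h, n_r = 4, 4, 2
n_t = n_v * n_h

# Transmit correlation of the UPA as a Kronecker product of vertical and horizontal parts.
r_tx = np.kron(exp_corr(n_v, 0.9), exp_corr(n_h, 0.8))
r_rx = exp_corr(n_r, 0.5)

# i.i.d. complex Gaussian matrix with zero mean and unit variance per entry.
h_w = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)

# Spatially correlated channel between the transmitter and one receiver.
h_k = psd_sqrt(r_rx) @ h_w @ psd_sqrt(r_tx)
print(h_k.shape)
```

Replacing `exp_corr` with matrices computed from the angular-spread model would leave the rest of the sketch unchanged.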


III. Background 

In this section, we briefly review how CS works and
discuss the importance of the original signal's sparsity in
reconstruction. To encode sparse signals, several compression
methods are available. Some signals are sparse themselves,
while others can be sparsified in some domain in which only
a few dominant coefficients are sufficient to represent the signals.
Consider an $N \times 1$ target signal x, which can be sparsified
into an $N \times 1$ sparsified signal s with an $N \times N$ sparsifying
basis $\Psi$ as

$$s = \Psi^* x,$$

where s has at most K non-zero elements. This type of
signal s is called K-sparse. If the target signal x is sparse
itself, the sparsifying basis $\Psi$ can be an identity matrix.
Commonly used examples of $\Psi$ include the discrete Fourier
transform (DFT) matrix and the discrete cosine transform
(DCT) matrix. Since such transformations are usually orthonormal,
the target signal can be represented as $x = \Psi s$.

In practice, it is hard to expect the sparsified signal s to be
exactly sparse. In such a case, s is assumed to be noisy-sparse, which
means it has K dominant elements and $(N - K)$ negligible elements.
To compress the target signal x, we introduce two methods:
1) CS, and 2) the dimensionality reduction.
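As a small illustration of a sparsifying basis and of a noisy-sparse signal, the sketch below builds an orthonormal DCT basis and measures how much energy a few dominant coefficients carry. The test signal, the helper name `dct_basis`, and the sizes are assumptions chosen only for the example.

```python
import numpy as np

def dct_basis(n):
    """Return an n x n orthonormal basis Psi whose columns are DCT-II vectors."""
    k = np.arange(n).reshape(-1, 1)      # frequency index
    i = np.arange(n).reshape(1, -1)      # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)              # scale the DC row so that C is orthonormal
    return c.T                           # Psi = C^T, so s = Psi^T x = C x

n, k_dom = 64, 8
psi = dct_basis(n)

# Hypothetical smooth test signal (slowly varying, hence compressible).
t = np.arange(n) / n
x = np.cos(2 * np.pi * t) + 0.5 * t

s = psi.T @ x                            # sparsified signal
x_back = psi @ s                         # exact inverse because Psi is orthonormal

# Keep only the k_dom largest-magnitude coefficients (noisy-sparse view).
idx = np.argsort(np.abs(s))[::-1][:k_dom]
s_k = np.zeros(n)
s_k[idx] = s[idx]
energy_kept = np.sum(s_k ** 2) / np.sum(s ** 2)
print(f"energy kept by {k_dom} of {n} coefficients: {energy_kept:.4f}")
```

The remaining coefficients are small but not exactly zero, which is the noisy-sparse situation described above.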

A. Compressive Sensing 

The greatest advantage of CS is not needing to know the 
indices (positions) of the non-zero elements in s. With CS, the 
target signal x is blindly encoded as an M x 1 measurement 
vector y via random projections as: 

$$y = \Phi x = \Phi \Psi s, \qquad (1)$$

where $\Phi$ is an $M \times N$ measurement matrix, which can be
generated randomly according to distributions such as
Gaussian or Bernoulli. The compression capability is bounded
as $M \ge cK\log(N/K)$ for some small constant c [18], [19]. The
compression ratio $\eta$ is calculated as $\eta = M/N$.

SIM et al.: COMPRESSED CHANNEL FEEDBACK FOR CORRELATED MASSIVE MIMO SYSTEMS

Since $\Phi$ is a wide matrix, $y = \Phi x$ is an underdetermined linear
system of equations. To reconstruct x from y, the decoder
solves the following $\ell_1$-norm minimization problem:

$$\min_s \|s\|_1 \quad \text{s.t.} \quad y = \Phi\Psi s,$$

which is typically solved by optimization algorithms such as
basis pursuit (BP). The decoder can also reconstruct s by
greedy algorithms such as orthogonal matching pursuit (OMP)
[?]. The exact reconstruction of x is guaranteed with high
probability by the restricted isometry property (RIP) of
$\Phi\Psi$ [?].

B. Encoding by Dimensionality Reduction 

It is an intuitive step to compress x by encoding the 
dominant elements in s by dimensionality reduction. In this 
case, the information on indices of such elements has to be 
known at the encoder, and also has to be fed back to the 
decoder. With some sparsifying bases, however, the positions
of the dominant elements in s are expected to be in a certain region.
Therefore, the encoder and the decoder can fix the order of
encoding/decoding s. For example, in image processing, JPEG
uses the 2D-DCT as a sparsifying basis and encodes
low-frequency data first.

IV. Massive MIMO Channel Feedback 

In this section, we introduce the two-dimensional discrete
cosine transform (2D-DCT) and the Karhunen-Loeve transform
(KLT) as sparsifying bases. We also explain how the
receiver encodes and feeds back CSI to the transmitter. To
reduce feedback overhead, we propose to compress CSI into
an $M \times 1$ vector via random projection or dimensionality
reduction. With each sparsifying basis, we specify the positions
of the dominant elements in the sparsified CSI vectors.

A. Sparsifying Basis 

An efficient sparsifying basis is needed to reconstruct the
compressed sparse signal with lower error. In practical cases,
the sparsified signal s may have K dominant elements while the
other $(N - K)$ elements are not zero, which means it
is not K-sparse. Since the reconstruction algorithms
assume that the sparsified signal is K-sparse, they reconstruct
only K elements. The other elements are treated as errors.
Therefore, it is important to use an efficient sparsifying basis
that makes the non-dominant elements smaller.

To handle CSI easily, $H_k$ is vectorized into an $N_r N_t \times 1$
vector

$$h_k = \mathrm{vec}(H_k).$$

For convenience, we omit the subscript k. We design a
sparsifying basis $\Psi$ to sparsify h. The sparsifying performance
of $\Psi$ plays a key role in reconstruction in both compression
methods with the fixed compression ratio $\eta = M/(N_r N_t)$.


[Figure 3 panels: (a) The 2D-DCT basis, showing the selection order on the matrix $C_{N_r}^T H C_{N_t}$; (b) The KLT basis.]

Figure 3: The order of selecting dominant elements in sparsified
CSI with the sparsifying bases.


1) The Two-Dimensional Discrete Cosine Transform Basis:
Due to the spatial correlation among the antennas, H is
expected to be sparse in the spatial-frequency domain. The
2D-DCT is widely used in lossy compression of audio and
images because of its strong energy compaction property and
computational simplicity. Also, if the 2D-DCT is chosen
as the sparsifying basis, there is no need to calculate
the basis, because the basis is fixed. Note that the matrix
operation of the 2D-DCT can be written as $C_{N_r}^T H C_{N_t}$, where
$C_N$ is the $N \times N$ DCT matrix. This can be written in vector
form as:

$$s_{\mathrm{DCT}} = (C_{N_t}^T \otimes C_{N_r}^T)\,\mathrm{vec}(H) = (C_{N_t} \otimes C_{N_r})^T h.$$

Therefore, a sparsifying basis with the 2D-DCT is
$\Psi_{\mathrm{DCT}} = (C_{N_t} \otimes C_{N_r})$. An advantage of the 2D-DCT as a sparsifying
basis is that $\Psi_{\mathrm{DCT}}$ is fixed even though the correlation of the
channel changes. In other words, the receiver and the transmitter
need not recalculate $\Psi_{\mathrm{DCT}}$ as the correlation changes. Since the
2D-DCT ignores information on how the channel is correlated,
however, the sparsifying performance is limited.
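The equivalence between the matrix form and the vector form of the 2D-DCT can be checked numerically. The sketch below assumes that the columns of the DCT matrix are the DCT-II basis vectors and uses small, arbitrary dimensions; the helper name `dct_matrix` is hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT matrix C_n whose columns are the DCT-II basis vectors."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c.T

rng = np.random.default_rng(0)
n_r, n_t = 4, 6
h_mat = rng.standard_normal((n_r, n_t))      # stand-in channel matrix H

c_r, c_t = dct_matrix(n_r), dct_matrix(n_t)
s_mat = c_r.T @ h_mat @ c_t                  # 2D-DCT in matrix form

psi_dct = np.kron(c_t, c_r)                  # Psi_DCT = C_{N_t} (x) C_{N_r}
h_vec = h_mat.flatten(order="F")             # vec(.) stacks the columns
s_vec = psi_dct.T @ h_vec                    # 2D-DCT in vector form

print(np.allclose(s_vec, s_mat.flatten(order="F")))
```

The check relies on the identity vec(AXB) = (Bᵀ ⊗ A) vec(X) with column-major vectorization, which is why `order="F"` is used.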

2) The Karhunen-Loeve Transform Basis: We assume that
the spatial correlation of the channel is fixed. Therefore, we
can employ the sparsifying basis spanned by the eigenvectors
of the covariance of h. The covariance $C_h$ can be
formulated as

$$C_h = E[h h^*].$$

Since $C_h$ is a Hermitian matrix, the proposed sparsifying basis,
$\Psi_{\mathrm{KLT}}$, can be computed by the eigenvalue decomposition:

$$C_h = \Psi_{\mathrm{KLT}} \Lambda \Psi_{\mathrm{KLT}}^{-1} = \Psi_{\mathrm{KLT}} \Lambda \Psi_{\mathrm{KLT}}^*,$$

where $\Psi_{\mathrm{KLT}}$ is a matrix consisting of normalized eigenvectors
of $C_h$, while $\Lambda$ is a diagonal matrix whose elements are the
corresponding eigenvalues. With the proposed basis $\Psi_{\mathrm{KLT}}$, the
covariance of the sparsified CSI vector $s_{\mathrm{KLT}} = \Psi_{\mathrm{KLT}}^* h$ is
calculated as

$$C_{s_{\mathrm{KLT}}} = E[s_{\mathrm{KLT}} s_{\mathrm{KLT}}^*] = E[\Psi_{\mathrm{KLT}}^* h h^* \Psi_{\mathrm{KLT}}] = \Psi_{\mathrm{KLT}}^* E[h h^*] \Psi_{\mathrm{KLT}} = \Psi_{\mathrm{KLT}}^* C_h \Psi_{\mathrm{KLT}} = \Lambda. \qquad (2)$$

[Figure 4 panels: (d) Re(h) with a UPA. (e) Re($s_{\mathrm{DCT}}$) with a UPA. (f) Re($s_{\mathrm{KLT}}$) with a UPA.]

Figure 4: The real part of CSI and sparsified CSI with two bases. (a), (b), (c) are the results with a ULA of antennas, and
(d), (e), (f) are the results with a UPA of antennas.

The elements of s are uncorrelated with each other (and independent
when h is Gaussian), and the variance of the i-th element of s is the
i-th eigenvalue $\lambda_i$. Due to the high correlation in the channel,
$\Lambda$ has only a few dominant elements, which means the proposed
basis provides strong sparsifying performance.
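The construction of the KLT basis from sample channel vectors can be sketched as below. The exponential correlation model, the dimension, and the correlation coefficient `rho` are assumptions made for illustration; only the use of 1000 sample vectors follows the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_samples, rho = 16, 1000, 0.95

# Hypothetical exponential correlation model, used only for illustration.
idx = np.arange(n)
r_true = rho ** np.abs(idx[:, None] - idx[None, :])
l_chol = np.linalg.cholesky(r_true)

# Draw correlated complex Gaussian channel vectors h = L w, one per column.
w = (rng.standard_normal((n, n_samples))
     + 1j * rng.standard_normal((n, n_samples))) / np.sqrt(2)
h = l_chol @ w

c_h = h @ h.conj().T / n_samples            # sample estimate of E[h h^*]
eigvals, eigvecs = np.linalg.eigh(c_h)      # ascending eigenvalues, orthonormal eigenvectors
order = np.argsort(eigvals)[::-1]           # reorder to descending eigenvalues
lam = eigvals[order]
psi_klt = eigvecs[:, order]

c_s = psi_klt.conj().T @ c_h @ psi_klt      # covariance of s = Psi^* h, should be diag(lam)
share = lam[:4].sum() / lam.sum()
print(f"share of variance in the 4 largest eigenvalues: {share:.3f}")
```

Sorting the eigenvectors by descending eigenvalue places the dominant elements of the sparsified vector first.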

To obtain the KLT basis, the channel covariance must
be estimated. It is reasonable to assume that $C_h$ changes
slowly compared to the coherence time of the channel H.
Furthermore, it is known that $C_h$ is frequency invariant for the
wide-sense stationary uncorrelated scattering (WSSUS) fading
model [?]. $C_h$, therefore, can be obtained at the transmitter
(or BS) by the uplink in FDD systems, or by a subspace tracking
algorithm using the downlink training.²


B. Proposed Channel Compression and Feedback 

The sparse signal compression techniques explained in
Section III can be applied to compress the channel feedback.
As mentioned in Section I, we consider two scenarios. In
Scenario 1, since the receiver does not know which basis
will be used, CS is adopted, and the CSI is
compressed via random projection. In Scenario 2, the receiver
knows which sparsifying basis will be adopted. Therefore,
CSI can be compressed by dimensionality reduction. Figure 2
shows the schematic of the CS-based and the
dimensionality-reduction-based MIMO channel feedback methods.

²One might argue that it is not true for large frequency separation. In this
paper, however, for simplicity, we assume the covariance is equal in different
frequency bands.

1) Specifying the Indices of Dominant Elements in Sparsified
Channel Information: With the sparsifying bases introduced
in Section IV-A, the indices of the dominant elements in
sparsified CSI s are expected to be concentrated in a certain region.
2D data from nature, such as pictures, tend to have the most
energy at low frequencies. From this observation, the indices
of the dominant elements of the sparsified CSI from the
2D-DCT basis can be specified at low frequencies. When
selecting K dominant elements, K elements are selected in
zig-zag order in the matrix form of sparsified CSI, $C_{N_r}^T H C_{N_t}$,
as illustrated in Figure 3(a). With the KLT basis, the variance of
each element of sparsified CSI s is determined by the eigenvalues
$\lambda$ of $C_h$. By rearranging the columns (eigenvectors) of $\Psi_{\mathrm{KLT}}$,
the eigenvalues $\lambda$ can be arranged in descending order. To
select K dominant elements in sparsified CSI s, therefore, the
first K elements of s are selected, as illustrated in Figure 3(b).
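The zig-zag selection of the K lowest-frequency 2D-DCT coefficients can be sketched as follows. The traversal direction (first step to the right, as in JPEG) and the function names are assumptions, since the exact order is only given graphically in Figure 3(a).

```python
def zigzag_indices(n_rows, n_cols):
    """Return (row, col) pairs of an n_rows x n_cols grid in zig-zag order."""
    order = []
    for d in range(n_rows + n_cols - 1):           # d = row + col indexes one anti-diagonal
        rows = range(max(0, d - n_cols + 1), min(d, n_rows - 1) + 1)
        diag = [(r, d - r) for r in rows]          # listed from top-right to bottom-left
        if d % 2 == 0:
            diag.reverse()                         # even diagonals run bottom-left to top-right
        order.extend(diag)
    return order

def dominant_vec_indices(n_r, n_t, k):
    """Indices into vec(.) (column-major) of the first k zig-zag positions."""
    return [r + c * n_r for r, c in zigzag_indices(n_r, n_t)[:k]]

print(zigzag_indices(3, 3))
print(dominant_vec_indices(2, 3, 4))
```

With the KLT basis no such traversal is needed: after sorting the eigenvectors by descending eigenvalue, the dominant indices are simply the first K positions.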

Figure 4 shows that the proposed selecting order described
in Figure 3 is reasonable. We generate the channels with
$N_t = 64$, $N_r = 64$, $d = 0.1\lambda$. Figure 4(a) is the real
part of the channel matrix H with a ULA of antennas.
Figures 4(b), 4(c) are the sparsified forms of H with the
2D-DCT basis and the KLT basis, respectively. To check that the
encoding order in Figure 3 is reasonable, the sparsified CSI
with the 2D-DCT basis is represented in matrix form, and
the sparsified CSI with the KLT basis is represented in vector
form. Figures 4(d), 4(e), 4(f) show the same quantities, but with
an 8 × 8 UPA of antennas on both transmitter and receiver.
For convenience, in the rest of this paper, we rearrange the
columns of the two sparsifying bases, $\Psi_{\mathrm{DCT}}$ and $\Psi_{\mathrm{KLT}}$, so that the
first K elements of s are selected as dominant elements.

Table I: OMP algorithm

Input:
    measurements $y$
    measurement matrix $\Phi$
    sparsifying basis $\Psi$
    sparsity $K$
Initialize:
    iteration count $k = 0$
    residual vector $r^0 = y$
    estimated support set $T^0 = \emptyset$
While $k < K$:
    $k = k + 1$
    $t^k = \arg\max_j |\langle r^{k-1}, \phi_j \rangle|$, where $\phi_j$ is the j-th column of $\Phi\Psi$
    $T^k = T^{k-1} \cup \{t^k\}$
    $s_{T^k} = \arg\min_{s} \| y - (\Phi\Psi)_{T^k} s \|_2$
    $r^k = y - (\Phi\Psi)_{T^k} s_{T^k}$
End
Reconstruction:
    $\hat{s} = \arg\min_{s:\, \mathrm{supp}(s) = T^K} \| y - \Phi\Psi s \|_2$
Output:
    $\hat{h} = \Psi \hat{s}$
Complexity:
    $O(N_r N_t M K)$
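For reference, a conventional OMP decoder can be sketched as below. This is a generic textbook-style implementation under the assumption that the combined matrix `a` equals ΦΨ; the function name `omp`, the column-norm normalization, and the demo sizes are choices made for this example rather than details taken from the paper.

```python
import numpy as np

def omp(y, a, k):
    """Greedy k-sparse solution of y = a @ s (orthogonal matching pursuit)."""
    n = a.shape[1]
    norms = np.linalg.norm(a, axis=0)
    residual = y.copy()
    support = []
    coef = np.zeros(0, dtype=a.dtype)
    for _ in range(k):
        # Correlate every column with the current residual.
        corr = np.abs(a.conj().T @ residual) / norms
        if support:
            corr[support] = 0.0                  # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(a[:, support], y, rcond=None)
        residual = y - a[:, support] @ coef
    s_hat = np.zeros(n, dtype=coef.dtype)
    s_hat[support] = coef
    return s_hat, sorted(support)

# Demo with an identity sparsifying basis (Psi = I), so that a = Phi.
rng = np.random.default_rng(0)
n, m, k = 64, 48, 3
phi = rng.standard_normal((m, n))
s_true = np.zeros(n)
s_true[rng.choice(n, size=k, replace=False)] = 1.0
y = phi @ s_true
s_hat, support = omp(y, phi, k)
print("estimated support:", support)
print("reconstruction error:", np.linalg.norm(s_hat - s_true))
```

Each iteration solves a least-squares problem on a growing support, which is the cost that the modified algorithm of Scenario 1 avoids by fixing the support in advance.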

2) Compressive Sensing-based Feedback with Modified 
OMP (Scenario 1): In Scenario 1, the receiver does not know 
whether the transmitter can exploit the KLT basis. Since
the compression part of CS does not need a sparsifying basis,
random projection is used for compressing the CSI. Using
random projection for compression, the transmitter can, for 
reconstruction, use either the 2D-DCT basis or the KLT basis. 
Since the sparsifying performance of the KLT basis is better 
than that of the 2D-DCT basis, the transmitter adopts, if it 
can, the KLT basis. 

In CS-based compression, according to (1), the $N_r N_t \times 1$
CSI h is encoded into the $M \times 1$ measurement vector y via
random projections:

$$y = \Phi h = \Phi \Psi s,$$

where the $M \times N_r N_t$ measurement matrix $\Phi$ is generated from the
i.i.d. Gaussian distribution with zero mean and unit variance,
and we assume that both the receiver and the transmitter share
$\Phi$. After the transmitter obtains the compressed data y, it
reconstructs the channel h. The reconstruction algorithms of
CS, including BP and OMP, reconstruct the sparsified signal
$\hat{s}$ and multiply it by $\Psi$ to get $\hat{h} = \Psi \hat{s}$.

OMP is a widely used algorithm due to its low computational
complexity. The complexity can be expressed as
$O(N_r N_t M K)$, a linear function of the sparsity level K [31]. It
iteratively investigates the support of the sparsified signal. In
each iteration, the correlations between each column of $\Phi\Psi$ and
the modified measurements (the so-called residual) are compared
to identify the elements of the support, as explained in Table I.
OMP, therefore, needs K iterations for reconstruction.

Table II: Modified OMP algorithm

Input:
    measurements $y$
    measurement matrix $\Phi$
    sparsifying basis $\Psi$
    reconstruction parameter $K_p$
Dominant basis:
    $\Psi_1$ consists of $K_p$ columns of $\Psi$ selected as in Section IV-B1
Reconstruction:
    $\hat{s}_1 = (\Phi\Psi_1)^\dagger y$
Output:
    $\hat{h} = \Psi_1 \hat{s}_1$
Complexity:
    $O(N_r N_t M)$

The support of the dominant elements in s can be specified
without iterations, as explained in Section IV-B1. Since the
order of selecting the dominant elements is known, only the
number of the dominant elements has to be determined. Let
$K_p$ ($< M$) denote the number of dominant elements to
be reconstructed; it is determined empirically considering
M and the sparsity of s. In a later section, we suggest some
intuition for choosing a proper $K_p$. Modified OMP, therefore, is
proposed as in Table II, with complexity $O(N_r N_t M)$. The
sparsified signal can be represented by the sum of the dominant
part and the non-dominant part:


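$$s = \begin{bmatrix} s_1 \\ 0_1 \end{bmatrix} + \begin{bmatrix} 0_2 \\ s_2 \end{bmatrix},$$

where $s_1$ and $s_2$ represent the $K_p \times 1$ dominant part and the
$(N_r N_t - K_p) \times 1$ non-dominant part of the sparsified signal,
respectively, and $0_1$ and $0_2$ represent an $(N_r N_t - K_p) \times 1$ zero
vector and a $K_p \times 1$ zero vector, respectively. The sparsifying
basis can also be separated as:

$$\Psi = [\, \Psi_1 \;\; \Psi_2 \,],$$

where $\Psi_1$ and $\Psi_2$ are $N_r N_t \times K_p$ and $N_r N_t \times (N_r N_t - K_p)$
matrices, respectively, which consist of the columns of $\Psi$. Since
$K_p < M$, the $K_p \times 1$ reconstructed dominant part $\hat{s}_1$ is
obtained as:

$$\hat{s}_1 = (\Phi\Psi_1)^\dagger y = (\Phi\Psi_1)^\dagger \Phi (\Psi_1 s_1 + \Psi_2 s_2) = s_1 + (\Phi\Psi_1)^\dagger \Phi\Psi_2 s_2,$$

where $(\cdot)^\dagger$ denotes the Moore-Penrose pseudoinverse. The
reconstructed CSI $\hat{h}$ is obtained by the inverse transform:

$$\hat{h} = \Psi_1 \hat{s}_1.$$

The squared error of CSI reconstruction is expressed as:

$$\mathrm{MSE}_h = \|h - \hat{h}\|_2^2 = \|(\Phi\Psi_1)^\dagger \Phi\Psi_2 s_2\|_2^2 + \|s_2\|_2^2. \qquad (3)$$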
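The one-shot reconstruction of the modified OMP and the error decomposition in (3) can be verified numerically. In the sketch below, the orthonormal basis is a random stand-in for the KLT or 2D-DCT basis, and the sizes and coefficient magnitudes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k_p = 32, 12, 6

# Hypothetical orthonormal basis Psi (Q factor of a random matrix).
psi, _ = np.linalg.qr(rng.standard_normal((n, n)))
phi = rng.standard_normal((m, n))                 # shared Gaussian measurement matrix

# Noisy-sparse s: k_p dominant entries followed by small non-dominant entries.
s = np.concatenate([rng.standard_normal(k_p), 0.01 * rng.standard_normal(n - k_p)])
h = psi @ s
y = phi @ h                                       # random-projection compression

psi1, psi2 = psi[:, :k_p], psi[:, k_p:]
pinv = np.linalg.pinv(phi @ psi1)                 # (Phi Psi_1)^dagger, size k_p x m
s1_hat = pinv @ y                                 # one-shot estimate, no iterations
h_hat = psi1 @ s1_hat

leak = pinv @ phi @ psi2 @ s[k_p:]                # error leaked from the non-dominant part
mse = np.linalg.norm(h - h_hat) ** 2
mse_eq3 = np.linalg.norm(leak) ** 2 + np.linalg.norm(s[k_p:]) ** 2
print(mse, mse_eq3)
```

The two printed values agree because the columns of Ψ₁ and Ψ₂ are mutually orthogonal, so the leaked error and the discarded part add in squared norm.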

3) Dimensionality Reduction based Feedback (Scenario 2): 
In Scenario 2, the receiver knows whether the transmitter 
can exploit the KLT basis. If both the transmitter and the 
receiver can exploit the KLT basis, the KLT basis is adopted 
as a sparsifying basis. If either the transmitter or the receiver 
cannot exploit the KLT basis, the 2D-DCT basis is used as the 
sparsifying basis. Since the sparsifying basis $\Psi$ is orthonormal,
the $M \times 1$ compressed CSI y and the $N_r N_t \times 1$ reconstructed
CSI $\hat{h}$ can be obtained as

$$y = s_3 = \Psi_3^* h,$$

$$\hat{h} = \Psi_3 y,$$

respectively, where $s_3$ is an $M \times 1$ vector consisting of the
first M elements of s and $\Psi_3$ is an $N_r N_t \times M$ matrix consisting
of the first M columns of $\Psi$.

Figure 5: (a) The average normalized MSE of quantized channel feedback with variation of bits used for quantization, (b) Sum
rate comparison with perfect CSI, and compressed CSI feedback with $N_u = 4$. Both simulations are with an 8 × 8 UPA of
antennas at the transmitter, $N_r = 1$, and $\eta \approx 0.047$.
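The dimensionality-reduction-based compression of Scenario 2 can be sketched in a few lines. The unitary basis and the geometrically decaying coefficients below are assumptions standing in for a sorted KLT basis and a highly correlated channel.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 32, 4

# Hypothetical unitary basis (Q factor of a random complex matrix).
psi, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# Hypothetical sparsified coefficients with rapidly decaying magnitudes.
decay = 0.5 ** np.arange(n)
s = decay * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
h = psi @ s

psi3 = psi[:, :m]                    # first M columns of Psi
y = psi3.conj().T @ h                # compressed CSI: first M sparsified coefficients
h_hat = psi3 @ y                     # reconstruction at the transmitter

err = np.linalg.norm(h - h_hat) ** 2
print(err, np.linalg.norm(s[m:]) ** 2)
```

The reconstruction error equals the energy of the discarded coefficients, so there is no leakage term of the kind that appears in (3).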

C. Codebook for Compressed Channels 

Vector quantization (VQ) [?] is a widely used and
efficient technique for data compression. It can be applied to
limited feedback in wireless communications. The objective
of VQ is to represent a set of input vectors $v \in V$ by
a set $C = \{c_1, \ldots, c_{N_c}\}$ of $N_c$ code vectors. C is
called the codebook. VQ can be represented as a mapping:

$$Q: V \to C.$$

With the function Q, it is possible to define a partition S
of the set V. It is constituted by the encoding regions $S_i \subset V$
corresponding to the code vectors $c_i$ as:

$$S_i = \{v \in V \mid Q(v) = c_i\}.$$

To evaluate how well a vector v is approximated by $c_i$, a distance
metric D is defined:

$$D(v, c_i) = \sqrt{(v - c_i)^* (v - c_i)}.$$

The mean quantization error (MQE) is defined with the fixed
codebook C and partition S:

$$\mathrm{MQE}(C, S) = E[D(v, Q(v))].$$

For codebook generation, this paper adopts RVQ [?], [23]
and the LBG algorithm [?]. In RVQ, the codebook C, which
is known to both the transmitter and the receiver, is randomly
generated each time the channel changes. The LBG algorithm
is an iterative algorithm that uses a training set to satisfy
two optimality criteria that minimize the MQE: the nearest
neighbor condition, and the centroid condition. Given a fixed
codebook C, the nearest neighbor condition assigns the nearest
code vector to each input vector. In other words, the encoding
region $S_i$ is obtained by the Voronoi partition [?]:

$$S_i = \{v \in V \mid D(v, c_i) \le D(v, c_j),\ \forall j \ne i\}.$$

Given a fixed partition S, the centroid condition finds the
optimal codebook constituted by the centroid of each encoding
region. Therefore, a code vector $c_i$ can be obtained as:

$$c_i = \frac{1}{|S_i|} \sum_{v \in S_i} v.$$

A whole iteration of the LBG algorithm obtains the (m+1)-th
codebook $C_{m+1}$ from the m-th codebook $C_m$ by executing two
operations: the calculation of the Voronoi partition of V by
adopting the codebook $C_m$, and the calculation of the codebook
$C_{m+1}$ whose elements satisfy the centroid condition.
This iteration is repeated until the MQE converges.

After the codebook and the partition are obtained through
the iterative part, the splitting part increases the size of the
codebook using the obtained codebook. The commonly used
splitting algorithm doubles the size of the codebook by splitting
each code vector $c_n$ into $(1 + \epsilon)c_n$ and $(1 - \epsilon)c_n$, where
$\epsilon$ is a small constant. After the
splitting part, the iterative part optimizes the codebook and the
partition. These two parts of the LBG algorithm are repeated
until the desired size of the codebook is obtained.
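The iterative and splitting parts described above can be sketched as follows. This is a simplified illustration, not the exact procedure used for the simulations: the training data, the stopping tolerance, and the handling of empty encoding regions are assumptions, and the centroid is taken as the arithmetic mean of each region.

```python
import numpy as np

def nearest(train, codebook):
    """Nearest code vector index per training vector, and the resulting MQE."""
    dist = np.linalg.norm(train[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(dist, axis=1), dist.min(axis=1).mean()

def lbg(train, n_bits, eps=0.01, tol=1e-6, max_iter=100):
    """LBG codebook with 2**n_bits code vectors, grown by splitting."""
    codebook = train.mean(axis=0, keepdims=True)       # start from the global centroid
    for _ in range(n_bits):
        # Splitting part: every code vector c becomes (1 + eps) c and (1 - eps) c.
        codebook = np.concatenate([(1 + eps) * codebook, (1 - eps) * codebook])
        prev = np.inf
        for _ in range(max_iter):
            labels, mqe = nearest(train, codebook)      # nearest neighbor condition
            for i in range(len(codebook)):              # centroid condition
                members = train[labels == i]
                if len(members) > 0:                    # leave empty regions unchanged
                    codebook[i] = members.mean(axis=0)
            if prev - mqe <= tol * mqe:                 # stop when the MQE has converged
                break
            prev = mqe
    return codebook

# Hypothetical training set: four clusters of 2-D points.
rng = np.random.default_rng(4)
centers = np.array([[2.0, 2.0], [2.0, 8.0], [8.0, 2.0], [8.0, 8.0]])
train = np.concatenate([c + 0.5 * rng.standard_normal((50, 2)) for c in centers])

codebook = lbg(train, n_bits=2)
_, mqe_lbg = nearest(train, codebook)
_, mqe_single = nearest(train, train.mean(axis=0, keepdims=True))
print(codebook.shape, mqe_lbg, mqe_single)
```

The same functions accept complex-valued training vectors, such as compressed CSI, because the distance is computed with the Euclidean norm.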

In this paper, since a CSI vector is to be quantized, the
codebook C consists of randomly generated channel vectors
h with fixed correlation matrices. To obtain the 6-bit LBG
codebook for the $M \times 1$ compressed feedback vector y, two
training sets (sets of input vectors) are generated for the two
scenarios. For Scenario 1, the training set consists of randomly
projected CSI; the training set for Scenario 2 consists of the
first M elements of sparsified CSI. Since the LBG algorithm
has an optimizing part, it is expected that the MQE
of LBG-based quantization is lower than that of RVQ. The
only drawback of the LBG algorithm is its need for computing
resources. For a channel whose correlation matrix can be
assumed to be static, the quantization performance can be
improved through the LBG algorithm.

(a) Using a ULA (b) Using a UPA

Figure 6: The average normalized MSE of channel feedback for a ULA and an 8 × 8 UPA with $N_t = 64$, $N_r = 1$, and
$d = 0.1\lambda$.

V. Performance Analysis 

In this section, we show that, in a massive MIMO system with
limited feedback, the use of highly correlated channels (using small
antenna spacing d) outperforms the use of uncorrelated channels
(using large d). We also compare the performance of three
compression methods: the conventional CS-based compression
method and the proposed compression methods with different
sparsifying bases.

A. Highly Correlated Channel ($d = 0.1\lambda$) vs. Uncorrelated
Channel ($d > 10\lambda$)

It is well known that, with perfect CSI, a higher data rate is
achieved with an uncorrelated channel than with a correlated one.
If the channel is uncorrelated, however, it is hard to compress,
so enormous amounts of data must be fed back. In contrast, a
correlated channel can be compressed efficiently, which means
the transmitter can obtain more accurate precoding vectors.
In summary, a highly correlated channel provides a lower
achievable rate, but enables the transmitter to exploit better
precoding vectors. The channel H only needs a sufficient number
of not-close-to-zero singular values (effective rank) to support
$N_u$ receivers. When $N_u$ is relatively small compared to $N_t$
and $N_r$, a correlated channel has enough effective rank to
support $N_u$ receivers. With better precoding performance, a
correlated channel can achieve a higher sum rate than
an uncorrelated channel in limited feedback scenarios.


The corresponding figure supports this discussion. We 
design three types of transmitters: an 8 × 8 UPA of antennas 
with d = 0.1λ, d = 0.5λ, and d = 10λ. Assume there are 4 
receivers and each receiver has one antenna. Each receiver 
compresses a 64 × 1 CSI vector h into a 3 × 1 encoded vector 
y (compression ratio η = 3/64 ≈ 0.047) by random-projection-based 
and dimensionality-reduction-based compression with 
the KLT basis. Also, y obtained via dimensionality reduction 
is quantized with the LBG algorithm, and h is quantized with 
RVQ. In this paper, we adopt an MMSE precoder. Figure (a) 
plots the average normalized mean square error (MSE) of 
quantized CSI versus the number of bits used. It shows that the 
MSE of quantized CSI with small d is much lower than that 
with large d. Figure (b) shows that there is only a small loss 
of achievable sum rate when CSI undergoes compression and 
the correlation of the channel is high, but the loss is large 
for the less correlated channel. Therefore, with limited 
feedback, a higher sum rate is achievable with a highly 
correlated channel. 
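
The dimensionality-reduction step described above can be sketched as follows. This is a minimal illustration, not the simulation setup of this paper: the exponential correlation model, the value 0.99, and the helper names `exp_corr`, `klt_basis`, and `expected_nmse` are all assumptions. The reported quantity is a ratio of expected energies rather than the averaged per-realization normalized MSE.

```python
import numpy as np

def exp_corr(n, rho):
    """Assumed exponential correlation model: R[i, j] = rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def klt_basis(R):
    """Eigenvalues and eigenvectors of R, sorted by decreasing eigenvalue."""
    w, U = np.linalg.eigh(R)
    order = np.argsort(w)[::-1]
    return np.clip(w[order], 0.0, None), U[:, order]

def expected_nmse(R, M):
    """Discarded energy over total energy when only M KLT coefficients are kept."""
    w, _ = klt_basis(R)
    return float(w[M:].sum() / w.sum())

N, M = 64, 3
eta = M / N                                   # compression ratio, 3/64

rng = np.random.default_rng(1)
w, U = klt_basis(exp_corr(N, 0.99))
g = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
h = U @ (np.sqrt(w) * g)                      # one channel draw with covariance R
y = U[:, :M].conj().T @ h                     # 3 x 1 encoded vector
h_hat = U[:, :M] @ y                          # reconstruction at the transmitter

nmse_high_corr = expected_nmse(exp_corr(N, 0.99), M)
nmse_uncorr = expected_nmse(exp_corr(N, 0.0), M)
```

For an uncorrelated channel all eigenvalues are equal, so keeping 3 of 64 coefficients discards 61/64 of the energy. A correlated channel concentrates its energy in the leading coefficients and loses less.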

B. Performance of Single-User MIMO Systems 

The simplest way to compare the performance of channel 
feedback is to compare the average normalized MSE 
between the original h and the fed-back h. We design 
single-user MIMO systems with a 64-element ULA and an 8 × 8 UPA 
at the transmitter and a single antenna at the receiver with 
d = 0.1λ. The reconstruction parameters for the modified 
OMP are Np = 9 and 6 for the ULA and the UPA, respectively, 
with the KLT basis, and Np = 19 with the 2D-DCT basis. 
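
As a rough illustration of the error metric and of reconstruction on a pre-specified set of dominant indices, the sketch below solves a least-squares problem on the first Np basis columns. It is a stand-in, not the modified OMP of this paper. The random orthonormal basis `Psi`, the Gaussian projection `Phi`, the choice M = 24, and the function names are assumptions made for the example.

```python
import numpy as np

def nmse(h, h_hat):
    """Normalized squared error between the original and reconstructed vectors."""
    return float(np.linalg.norm(h - h_hat) ** 2 / np.linalg.norm(h) ** 2)

def reconstruct_known_support(y, Phi, Psi, Np):
    """Least-squares estimate assuming the first Np coefficients in Psi dominate."""
    A = Phi @ Psi[:, :Np]                          # M x Np effective sensing matrix
    s_hat = np.linalg.lstsq(A, y, rcond=None)[0]   # pseudoinverse solution
    return Psi[:, :Np] @ s_hat

rng = np.random.default_rng(2)
N, M, Np = 64, 24, 9
Psi, _ = np.linalg.qr(rng.standard_normal((N, N)))   # hypothetical orthonormal basis
s = np.zeros(N)
s[:Np] = rng.standard_normal(Np)                     # exactly Np non-zero coefficients
h = Psi @ s
Phi = rng.standard_normal((M, N)) / np.sqrt(M)       # random projection
y = Phi @ h                                          # compressed feedback
h_hat = reconstruct_known_support(y, Phi, Psi, Np)
error = nmse(h, h_hat)
```

In this idealized case the vector is exactly Np-sparse on the assumed indices, so the error is at the level of floating-point rounding. With real CSI, the coefficients outside the first Np act as a residual error term.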

The corresponding figure shows that CSI is fed back with less 
error with the KLT basis than with the 2D-DCT basis. Compared 
to the compression error of conventional OMP, both proposed 
compression methods substantially decrease the compression 
error, which means CSI can be compressed with a lower 
compression ratio. Recalling that both proposed compression 
methods call for fewer computing resources, we can conclude 
that the proposed methods perform better. In the UPA case, 
Figure (b) shows that, when the KLT basis is used, the 
compression error is lower than in the ULA case. Due to the 
high correlation, the KLT-sparsity of the sparsified CSI is 
lower with a UPA. 

Figure 7: Sum rate comparison (versus SNR in dB) using an 
MMSE precoder computed with perfect CSI, compressed CSI by 
conventional OMP, by the proposed methods, and by the 
proposed 10-bit codebook. The system is designed with 
Nt = 64, Nr = 1, Nu = 4, d = 0.1λ, and η = 0.047. A UPA is 
implemented at the transmitter. 

When the 2D-DCT is used for sparsifying, however, there are 
other dominant elements outside the zig-zag-selected elements. 
In Figure (e), we can see some extra peaks. These peaks are 
caused by the order of antenna indexing in a UPA. 
As we can see in Figure (b), the correlation in a UPA does 
not, unlike in a ULA, decrease monotonically as the antenna 
index increases (the distance does not grow monotonically), 
but increases periodically. For 
instance, in the case of Figure (b), pij is larger than pis. Due to 
the geometry of a UPA, therefore, the 2D-DCT is not suitable 
for sparsifying UPA CSI. 
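
The indexing effect can be checked with simple geometry. The sketch below assumes row-major, 0-based antenna indexing on an 8 × 8 grid with spacing d = 0.1 wavelengths. The indexing order is an assumption made for the example, and the function names are hypothetical.

```python
import math

def upa_distance(i, j, cols=8, d=0.1):
    """Distance in wavelengths between antennas i and j of a UPA,
    assuming row-major, 0-based indexing with `cols` antennas per row."""
    ri, ci = divmod(i, cols)
    rj, cj = divmod(j, cols)
    return d * math.hypot(ri - rj, ci - cj)

def ula_distance(i, j, d=0.1):
    """Distance in wavelengths between antennas i and j of a ULA."""
    return d * abs(i - j)

# Distance from the first antenna to every other antenna index.
upa = [upa_distance(0, k) for k in range(64)]
ula = [ula_distance(0, k) for k in range(64)]
```

In the ULA the distance grows with every index step. In the UPA, under the assumed indexing, the antenna at index 8 starts a new row and is closer to index 0 than the antenna at index 7 is, so the distance (and hence the correlation) is periodic in the index.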

In the corresponding figure, we can observe that the MSE of 
modified OMP using the 2D-DCT basis is, compared to the others, 
abnormally high when η is around 0.3, which means M ≈ Np = 19. 
From the intuition that the condition number of the M × Np 
matrix is large when M/Np is close to 1, we conclude that 
the increase of MSE is due to the pseudoinverse term in the 
reconstruction. In other words, if this matrix is square or 
its numbers of columns and rows are similar, the linear system 
for y becomes sensitive to the residual error term; therefore, 
Np should not be chosen close to M. For the case using the 
KLT basis, since the residual error term is small enough, 
there is no MSE peak. 
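
The conditioning intuition can be checked numerically with a short sketch. It assumes the M × Np matrix behaves like a matrix with i.i.d. real Gaussian entries, which is an assumption and not a statement about the sensing matrix used in this paper. The function name `median_cond` and the tested values of M are hypothetical.

```python
import numpy as np

def median_cond(M, Np, trials=200, seed=0):
    """Median 2-norm condition number of random M x Np Gaussian matrices."""
    rng = np.random.default_rng(seed)
    conds = [np.linalg.cond(rng.standard_normal((M, Np))) for _ in range(trials)]
    return float(np.median(conds))

Np = 19
# Wide (M < Np), square (M = Np), and tall (M > Np) cases.
cond_by_M = {M: median_cond(M, Np) for M in (6, 19, 57)}
```

Under this assumption the square case M = Np is conditioned far worse than either rectangular case, which is consistent with the MSE peak appearing only when M is close to Np.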

C. Performance of Multi-User MIMO Systems 

The accuracy of channel feedback in a multi-user 
system can be measured by the achievable rate, which reflects 
the performance of the precoding. In a system with Nt = 64, 
Nr = 1, Nu = 4, and d = 0.1λ, with an 8 × 8 UPA, we 
calculate the sum rate using an MMSE precoder with different 
channel feedback methods at η = 0.047. The sum rate with 
perfect CSI is calculated as the theoretical upper bound, and 
the sum rate with fed-back CSI using conventional OMP with 
2D-DCT is calculated as reference data. We compare the 
sum rates with compressed CSI by three different compression 
methods: the conventional CS-based methods using either 
OMP or modified OMP as reconstruction algorithms, and the 
dimensionality reduction method. Each method is simulated 
with two kinds of sparsifying bases. Figure 7 shows that 
the performance of the proposed methods is better than that 
of the conventional one. The reconstruction parameters are 
Np = 3 and 4 for modified OMP when the sparsifying bases are 
the KLT basis and the 2D-DCT basis, respectively. We also 
simulate limited feedback with the 10-bit LBG 
codebooks. 
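
The sum-rate evaluation can be sketched as follows. The precoder is a common regularized channel-inversion form with regularization Nu/SNR and unit total transmit power. This form, the unit noise power, and the function names are assumptions, since the exact precoder expression is not restated in this section.

```python
import numpy as np

def mmse_precoder(H_est, snr):
    """Regularized channel inversion built from the (possibly imperfect)
    channel estimate H_est of shape (Nu, Nt); total transmit power is 1."""
    Nu = H_est.shape[0]
    A = H_est @ H_est.conj().T + (Nu / snr) * np.eye(Nu)
    W = H_est.conj().T @ np.linalg.inv(A)
    return W / np.linalg.norm(W)            # Frobenius normalization

def sum_rate(H, W, snr):
    """Sum of log2(1 + SINR) over users for true channel H and precoder W,
    with linear-scale snr and unit noise power."""
    G = np.abs(H @ W) ** 2                  # G[k, j]: power of stream j at user k
    signal = np.diag(G)
    interference = G.sum(axis=1) - signal
    sinr = snr * signal / (1.0 + snr * interference)
    return float(np.log2(1.0 + sinr).sum())

# Toy check: 4 users with mutually orthogonal unit-norm channels, Nt = 64.
H = np.hstack([np.eye(4), np.zeros((4, 60))])
snr = 12.0
rate_perfect = sum_rate(H, mmse_precoder(H, snr), snr)

# The same precoder computed from a perturbed estimate, evaluated on the true H.
rng = np.random.default_rng(3)
H_noisy = H + 0.3 * rng.standard_normal(H.shape)
rate_noisy = sum_rate(H, mmse_precoder(H_noisy, snr), snr)
```

In the toy case each user sees no interference and an SINR of snr/4 = 3, so the sum rate is 4 · log2(4) = 8 bit/s/Hz. Replacing `H_noisy` with CSI reconstructed by a given feedback method gives the kind of comparison described above.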

VI. Conclusions 

This paper proposed sparsifying-based compression mechanisms 
to reduce the load of channel feedback in spatially 
correlated massive MIMO systems. We adopted the KLT basis 
as the sparsifying basis. Using the fact that the indices of 
the dominant elements in the sparsified CSI, with the particular 
sparsifying basis, can be specified, we proposed modified OMP 
as a reconstruction algorithm for CS, and 
dimensionality-reduction-based compression. For limited 
feedback, we applied the LBG algorithm to generate a codebook. 
We suggested that, considering the accuracy of channel 
feedback in massive MIMO systems, using highly correlated 
channels could yield higher achievable data rates than using 
uncorrelated channels. Future work will consider practical 
issues such as finding a proper reconstruction parameter Np, 
correlation estimation, and quantization errors. 